ENH: Two-level content-addressed CastXML cache + pkl stamp fix by hjmjohnson · Pull Request #6486 · InsightSoftwareConsortium/ITK

hjmjohnson · 2026-06-21T23:49:17Z

Add a two-level content-addressed CastXML cache (ITK_WRAP_CASTXML_CACHE, default ON) that
avoids re-running CastXML when the input headers and compiler flags are unchanged, and
remove the vestigial igenerator-level cache, which left pkl files missing after a cache hit.

Cache design

L1 (no subprocess): sha256 of .cxx content + .castxml.inc flags → L2 key
L2 (content-only): sha256 of castxml -E preprocessor output → output.xml.gz

L2 keys are path-independent, so multiple worktrees and CI agents share one store.
Cache root: ~/.cache/itk-wrap (override via ITK_WRAP_CACHE).
ITK_WRAP_CACHE_VERBOSE=1 logs HIT/MISS per file.
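
A minimal sketch of the two-level lookup described above, assuming in-memory dicts in place of the on-disk store. The class and function names, the `preprocess` and `run_castxml` callables, and the length-prefixed hashing are all illustrative assumptions, not the code in itk-castxml-cache.py.

```python
import hashlib


def sha256_hex(*parts: bytes) -> str:
    """Hash several byte strings into one hex digest."""
    h = hashlib.sha256()
    for part in parts:
        # Length-prefix each part so (b"ab", b"c") and (b"a", b"bc") differ.
        h.update(len(part).to_bytes(8, "little"))
        h.update(part)
    return h.hexdigest()


class TwoLevelCache:
    """Hypothetical in-memory stand-in for the L1/L2 store."""

    def __init__(self):
        self.l1 = {}  # L1 key (hash of .cxx + .castxml.inc) -> L2 key
        self.l2 = {}  # L2 key (hash of preprocessor output) -> XML bytes

    def lookup(self, cxx: bytes, inc: bytes, preprocess, run_castxml):
        # L1: hashing only, no subprocess.
        k1 = sha256_hex(cxx, inc)
        k2 = self.l1.get(k1)
        if k2 is not None and k2 in self.l2:
            return self.l2[k2], "L1-HIT"
        # L2: hash the preprocessed translation unit (content only, no paths).
        k2 = sha256_hex(preprocess(cxx, inc))
        self.l1[k1] = k2
        if k2 in self.l2:
            return self.l2[k2], "L2-HIT"
        # Full miss: run the expensive step and seed the store.
        xml = run_castxml(cxx, inc)
        self.l2[k2] = xml
        return xml, "MISS"
```

In this sketch an L1 hit needs no subprocess at all, while an L2 hit still pays for the preprocessor pass but skips the full CastXML run.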

igenerator cache removal

The igenerator-level LRU cache restored .i/.idx/.index.txt but silently skipped
.pkl files (names not known at cache-write time). This caused pyi_generator.py to
fail with "No pickle files were found" on incremental rebuilds. Removing the cache
eliminates the failure mode; the CastXML cache recovers the wall-clock savings.

A per-module <Module>.pkl.stamp file (new --pkl_stamp argument to igenerator.py)
is declared as a CMake OUTPUT so ninja can track pkl-file completeness without
enumerating the 1028 individual pkl paths at configure time.
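
The stamp ordering can be sketched as follows. This is an illustration under stated assumptions: the function name, the stamp contents (a sorted list of pkl names), and the temp-file rename are hypothetical; the PR only states that the stamp is written after all pkl files for the module are complete.

```python
import os
import pickle
from pathlib import Path


def write_pkls_then_stamp(pkl_dir: Path, stamp: Path, payloads: dict) -> None:
    """Write every pkl file first, then publish the stamp as the last step."""
    pkl_dir.mkdir(parents=True, exist_ok=True)
    for name, obj in payloads.items():
        with open(pkl_dir / f"{name}.pkl", "wb") as f:
            pickle.dump(obj, f)
    # The stamp appears only after all pkl files exist. Writing to a temporary
    # name and renaming means a crash never leaves a half-written stamp.
    tmp = stamp.with_name(stamp.name + ".tmp")
    tmp.write_text("\n".join(sorted(payloads)) + "\n")
    os.replace(tmp, stamp)
```

Because the build tool tracks only the stamp, a missing stamp is enough to force the generator to run again and rewrite the pkl files.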

Benchmark results (72-core Linux, local cache)

Build	Condition	Wall-clock	CastXML cache	ccache rate
6	warmup (fills ccache)	6m47s	cold	99%
7	cold CastXML cache, warm ccache	6m39s	cold (seeds)	100%
8	warm cache, same build dir	6m7s	816/816 hits	100%
9	warm cache, fresh build dir	6m29s	816/816 hits	99%
10	ninja-warm (incremental)	1s	n/a	0 compilations

Same-dir savings (build 8 vs. build 7): 32 s (8%). Cross-dir savings (build 9 vs. build 7): 10 s (2.5%).
CastXML parallelizes well on 72 cores, limiting the cache speedup; igenerator
and SWIG generation dominate the remaining wall time.
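
The savings figures can be re-derived from the table. The sketch below assumes build 7 (cold CastXML cache, warm ccache) is the baseline for both comparisons; the helper names are illustrative.

```python
def to_seconds(wall_clock: str) -> int:
    """Convert a string such as '6m39s' to seconds."""
    minutes, seconds = wall_clock.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


def savings(baseline: str, candidate: str) -> tuple[int, float]:
    """Return (seconds saved, percent saved) relative to the baseline."""
    base = to_seconds(baseline)
    saved = base - to_seconds(candidate)
    return saved, 100.0 * saved / base


same_dir = savings("6m39s", "6m7s")    # build 7 vs. build 8
cross_dir = savings("6m39s", "6m29s")  # build 7 vs. build 9
print(f"same-dir:  {same_dir[0]} s ({same_dir[1]:.1f}%)")
print(f"cross-dir: {cross_dir[0]} s ({cross_dir[1]:.1f}%)")
```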

Documentation

Documentation/docs/contributing/wrapping_architecture.md added:
full pipeline reference from .wrap files through configure-time generation,
CastXML, igenerator, SWIG, compilation, linking, and pyi generation.

hjmjohnson · 2026-06-22T05:21:16Z

/azp run ITK.macOS.Python

Ubuntu-22.04 and ubuntu-24.04 hosted agents ship Android SDK (~9 GB), Haskell/GHCup (~5 GB), .NET (~2-3 GB), Swift (~1.5 GB), CodeQL (~2 GB), and Boost headers (~1.2 GB). ITK's Linux builds use none of these; removing them at job start recovers ~20 GB before checkout, ccache restore, and the build itself consume disk.

Add ITK_WRAP_CASTXML_CACHE option (default OFF). Wraps castxml with a two-level cache:
  L1 (no subprocess): sha256 of binary content-hash + inc + cxx
  L2 (content-only): sha256 of castxml -E output, markers stripped
An L1 hit restores gzip-compressed XML with no castxml process. L2 keys are path-independent; worktrees share the same store. The binary is fingerprinted by content hash so ninja -t clean reuses L1. LRU eviction via background fork; 2 GiB cap (ITK_WRAP_CACHE_MAX_SIZE). igenerator.py gains matching LRU eviction and bypass flag.
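
A minimal sketch of size-capped LRU eviction, assuming recency is tracked through file modification time (refreshed on every hit). The function names are hypothetical, and the sketch runs synchronously rather than in a background fork as the commit describes.

```python
import os
from pathlib import Path


def touch_on_hit(entry: Path) -> None:
    """Refresh mtime so a recently used entry sorts as 'new'."""
    os.utime(entry)


def evict_lru(root: Path, max_bytes: int) -> list[Path]:
    """Delete the least recently used files until the store fits the cap."""
    entries = []
    total = 0
    for path in root.rglob("*"):
        if path.is_file():
            st = path.stat()
            entries.append((st.st_mtime, st.st_size, path))
            total += st.st_size
    removed = []
    # Oldest mtime first.
    for _mtime, size, path in sorted(entries, key=lambda e: e[0]):
        if total <= max_bytes:
            break
        path.unlink(missing_ok=True)
        total -= size
        removed.append(path)
    return removed
```

With a 2 GiB cap the call would be `evict_lru(root, 2 * 1024**3)`.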

…cache

Extend ITK_WRAP_CACHE to a colon-separated list of roots (like PATH). Reads search each root in order; writes go to the first that accepts an atomic rename. A read-only shared NFS cache can follow a writable SSD:

  export ITK_WRAP_CACHE=/local/ssd/cache:/nfs/lab/shared-cache

Students get L2 hits from the shared cache while storing L1 maps locally.

Add ITK_WRAP_CACHE_FORMAT=uncompressed: stores plain XML and restores via os.link() when cache and build share a filesystem, so A/B/C/D test builds each cost one L2 inode rather than N copies. Falls back to shutil.copy2() on cross-device links. gzip remains the default.

Unlink output_xml before a full castxml run to sever any prior hardlink to the L2 store so castxml cannot corrupt a shared inode.
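
The multi-root behaviour can be sketched as below. This is an assumption-laden illustration: `parse_roots`, `find_entry`, and `store_entry` are hypothetical names, entries are stored flat under each root, and any OSError is treated as "this root does not accept writes".

```python
import os
import tempfile
from pathlib import Path


def parse_roots(value: str) -> list[Path]:
    """Split a colon-separated ITK_WRAP_CACHE value, skipping empty items."""
    return [Path(p) for p in value.split(":") if p]


def find_entry(roots: list[Path], key: str):
    """Reads: search each root in order and return the first match."""
    for root in roots:
        candidate = root / key
        if candidate.is_file():
            return candidate
    return None


def store_entry(roots: list[Path], key: str, data: bytes):
    """Writes: use the first root that accepts an atomic rename."""
    for root in roots:
        tmp = None
        try:
            root.mkdir(parents=True, exist_ok=True)
            fd, tmp = tempfile.mkstemp(dir=root, prefix=".tmp-")
            with os.fdopen(fd, "wb") as f:
                f.write(data)
            os.replace(tmp, root / key)  # atomic within one filesystem
            return root / key
        except OSError:
            # Read-only or otherwise unusable root: clean up and try the next.
            if tmp is not None and os.path.exists(tmp):
                os.unlink(tmp)
    return None
```

Writing to a temporary file in the target directory and renaming it keeps concurrent readers from ever seeing a partially written entry.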

The constant (_CACHE_FMT, renamed to _KEY_VERSION) is a key-algorithm version salt, not a storage format descriptor. Renaming clarifies that it belongs to the hash key computation and should not change when the storage format changes.

…E to ON

Remove the hardlink restore path from _restore_xml(): shutil.copy2() is sufficient, and disk space is not constrained enough to justify the POSIX-only os.link() complexity and cross-device fallback. gzip remains the default storage format (~253 MB for a full 807-module build vs 2.2 G uncompressed).

Default ITK_WRAP_CASTXML_CACHE to ON so new build directories benefit from cross-dir L2 sharing without manual configuration. The cache location defaults to ~/.cache/itk-wrap; CI overrides via ITK_WRAP_CACHE.
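
A minimal sketch of gzip storage and copy-based restore. The names `store_xml` and `restore_xml` are hypothetical and are not the PR's `_restore_xml()`; the temp-file rename on restore is also an assumption.

```python
import gzip
import os
import shutil
from pathlib import Path


def store_xml(output_xml: Path, cache_entry: Path) -> None:
    """Compress a freshly generated XML file into the cache."""
    with open(output_xml, "rb") as src, gzip.open(cache_entry, "wb") as dst:
        shutil.copyfileobj(src, dst)


def restore_xml(cache_entry: Path, output_xml: Path) -> None:
    """Decompress a cache entry into the build tree as an independent copy."""
    tmp = output_xml.with_name(output_xml.name + ".tmp")
    with gzip.open(cache_entry, "rb") as src, open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Rename last so a partially written file is never seen as the output.
    os.replace(tmp, output_xml)
```

Because the restored file is a copy rather than a hardlink, a later castxml run that overwrites it cannot touch the cached entry.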

Add .github/workflows/python.yml (ITK.Pixi.Python) to run the Python wrapping build on ubuntu-24.04, windows-2022, and macos-15.

Mirror the ccache persistence pattern from Pixi-Cxx: restore before configure, save (if !cancelled) after build. Add a second castxml-v1 cache restore/save pair pointing at ${{ runner.temp }}/itk-castxml-cache, passed to the build via ITK_WRAP_CACHE. On a cold run the cache is seeded; on a warm run castxml is skipped for all 807 wrapped types: measured 6m37s vs 9m30s on a 72-core machine, with a larger speedup expected on 4-core CI runners where castxml is on the critical path.

Add configure-python-ci, build-python-ci, and test-python-ci pixi tasks that mirror their non-CI counterparts but pass -DITK_WRAP_CASTXML_CACHE:BOOL=ON explicitly.

Add ITK_WRAP_CACHE pipeline variable and a Cache@2 restore task (castxml-v1 key) to ITK.Linux.Python, ITK.macOS.Python, and ITK.Windows.Python. The Cache@2 task mirrors the existing ccache pattern: restore before the build step, Azure DevOps automatically saves on post-job when the path is non-empty. ITK_WRAP_CASTXML_CACHE defaults to ON (set in itkWrapCastXMLCacheSupport.cmake), so the cache is active without any dashboard.cmake change.

Wrapping/CMakeLists.txt: include(itkWrapCastXMLCacheSupport) so ITK_WRAP_CASTXML_CACHE_SCRIPT is set for the condition guard in itk_auto_load_submodules.cmake; guarded by ITK_WRAP_PYTHON.

python.yml: exclude windows-2022; itk_end_wrap_module.cmake produces an igenerator command exceeding cmd.exe's 8191-char batch-file line limit for large modules such as ITKImageIntensity (59 submodules). Pre-existing issue, unrelated to the castxml cache changes.

Assisted-by: Claude Code (root-cause: missing include and Windows batch-file limit)

Invalidates all existing v3 L2 entries (different hash prefix → different path → orphaned, pruned by LRU eviction) so the next build seeds fresh timing data for the 5-build overnight benchmark protocol.

Co-Authored-By: Hans Johnson <hans.j.johnson@gmail.com>
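
Why a version bump orphans old entries can be illustrated with a salted key. Everything here is an assumption for illustration: the salt values, the NUL separator, the two-character directory fan-out, and the function name are not taken from the PR.

```python
import hashlib

_KEY_VERSION = b"v4"  # hypothetical value; the salt lives in the key computation


def l2_entry_path(preprocessed: bytes, key_version: bytes = _KEY_VERSION) -> str:
    """Map preprocessor output to a cache-relative path, salted by key version."""
    digest = hashlib.sha256(key_version + b"\0" + preprocessed).hexdigest()
    # Fan out by the first two hex characters to keep directories small.
    return f"{digest[:2]}/{digest}.xml.gz"
```

Since the salt feeds the hash, changing it moves every entry to a new path; the old files are never looked up again and age out through eviction. Changing the storage format alone would not require a new salt.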

The igenerator cache (ITK_IGENERATOR_CACHE / ~/.cache/itk-igenerator) was an incomplete implementation that saved .i/.idx/SwigInterface.h files but never saved .pkl files. On a warm cache hit _igenerator_restore() returned early, leaving the itk-pkl/ directory empty. pyi_generator then failed with "No pickle files were found". The itk-castxml-cache already covers the expensive CastXML step. igenerator itself is fast once the XML is available, so a separate layer adds complexity without benefit. Remove the six cache functions and their two call sites in main() entirely, restoring the original clean architecture.

igenerator.py writes N pkl files per module as side effects that ninja cannot track because their names (ClassName.SubmoduleName.pkl) are not enumerable at CMake configure time. When pkl files are deleted while the .index.txt byproducts survive, ninja considers igenerator up-to-date and pyi_generator.py fails with "No pickle files were found." Add a --pkl_stamp argument to igenerator.py. The stamp is written after all pkl files for the module are complete and is declared as a CMake OUTPUT of the igenerator add_custom_command. Ninja now re-runs igenerator whenever the stamp is absent, which guarantees the pkl files are regenerated before pyi_generator.py reads the .index.txt manifests.

Documents the two-phase pipeline (CMake configure → Ninja build) that converts .wrap files into .abi3.so modules and .pyi stubs. Covers:
- Configure phase: how .wrap macros produce the three files written to castxml_inputs/ (.cxx, .castxml.inc, SwigInterface.h.in)
- Build phase: CastXML (816 independent jobs) → igenerator.py (96 per-module jobs, no global barrier) → SWIG/compile/link → pyi_generator
- Key file reference table mapping each file to its writer and reader
- CastXML cache and ccache summary
- Ninja dependency graph in ASCII
- Troubleshooting section for the two most common failure modes

hjmjohnson marked this pull request as ready for review June 21, 2026 23:49

github-actions Bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots area:Python wrapping Python bindings for a class type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct labels Jun 21, 2026


greptile-apps Bot reviewed Jun 21, 2026


Comment thread Wrapping/Generators/CastXML/itk-castxml-cache.py

Comment thread .github/workflows/python.yml

Comment thread .github/workflows/python.yml

Comment thread pyproject.toml

github-actions Bot added the area:IO Issues affecting the IO module label Jun 22, 2026

hjmjohnson force-pushed the ci/linux-azure-disk-management branch from 3400645 to 20b1c6e Compare June 22, 2026 03:08

github-actions Bot removed the area:IO Issues affecting the IO module label Jun 22, 2026

github-actions Bot added the area:IO Issues affecting the IO module label Jun 22, 2026

hjmjohnson force-pushed the ci/linux-azure-disk-management branch from fba93ba to 47a29b7 Compare June 22, 2026 18:49

github-actions Bot removed the area:IO Issues affecting the IO module label Jun 22, 2026

hjmjohnson and others added 12 commits June 23, 2026 05:34

STYLE: Rename _CACHE_FMT to _KEY_VERSION in castxml cache

8f59774


hjmjohnson force-pushed the ci/linux-azure-disk-management branch from 47a29b7 to cbcd65a Compare June 23, 2026 10:39

github-actions Bot added the area:Documentation Issues affecting the Documentation module label Jun 23, 2026

hjmjohnson changed the title ~~WIP: CI TESTING ENH: Two-level CastXML/igenerator build cache + Python CI workflow~~ ENH: Two-level content-addressed CastXML cache + pkl stamp fix Jun 23, 2026

ENH: Two-level content-addressed CastXML cache + pkl stamp fix #6486
hjmjohnson wants to merge 12 commits into InsightSoftwareConsortium:main from hjmjohnson:ci/linux-azure-disk-management
